RESEARCH ARTICLE 



Mutations in Transcriptional Regulators Allow Selective Engineering 
ABSTRACT Bacterial cells monitor their environment by sensing a set of signals. Typically, these environmental signals affect
promoter activities by altering the activity of transcription regulatory proteins. Promoters are often regulated by more than one
regulatory protein, and in these cases the relevant signals are integrated by certain logic. In this work, we study how single amino 
acid substitutions in a regulatory protein (GalR) affect transcriptional regulation and signal integration logic at a set of engi- 
neered promoters. Our results suggest that point mutations in regulatory genes allow independent evolution of regulatory logic 
at different promoters. 

IMPORTANCE Gene regulatory networks are built from simple building blocks, such as promoters, transcription regulatory pro- 
teins, and their binding sites on DNA. Many promoters are regulated by more than one regulatory input. In these cases, the in- 
puts are integrated and allow transcription only in certain combinations of input signals. Gene regulatory networks can be easily 
rewired, because the function of cis-regulatory elements and promoters can be altered by point mutations. In this work, we 
tested how point mutations in transcription regulatory proteins can affect signal integration logic. We found that such muta- 
tions allow context-dependent engineering of signal integration logic at promoters, further contributing to the plasticity of gene 
regulatory networks. 

Many transcription regulatory proteins sense the level of small
molecule signals and bind specific sites in the cis-regulatory
region of a defined set of genes, modifying their transcription. The
logic of regulation at these genes depends on whether the tran- 
scription regulatory protein is an activator (positive control) or a 
repressor (negative control) and whether the small molecule sig- 
nal enhances or inhibits the protein's activity. However, expres- 
sion of many genes depends on more than one transcription reg- 
ulatory protein-signal molecule pair. In these cases, the combined 
effect of the incoming signals depends on how the transcription 
regulatory proteins affect each other's binding to DNA and inter- 
action with RNA polymerase. That is, the signal integration logic 
may be different from the sum of the logic observed in the case of 
the individual signals (1-5). The simplest cases for studying signal 
integration are the two-input systems. Sugar regulatory systems 
are classical examples for integration of two signals, one of which 
is a global signal for carbon starvation (cyclic AMP [cAMP]), and
the other is the specific sugar transported and metabolized by the 
system (5-9). In the presence of cAMP, the cAMP receptor pro- 
tein (CRP) can specifically bind to a 16-bp sequence and activate 
or repress transcription depending on the location of the binding 
site (3, 10). The intracellular concentration of a given sugar is 
sensed by a specific transcription regulatory protein. Binding of 
the sugar to the regulator typically induces an allosteric change in 
the protein's structure, altering its DNA binding properties (11, 
12). 

Transcription regulatory proteins in these systems typically
possess surfaces for protein-protein, protein-DNA, and protein-small
molecule interactions. Single amino acid substitutions allow
engineering the binding characteristics of individual surfaces, 
leaving the other surfaces unchanged. Such changes can affect 
dimerization, tetramerization (13, 14), DNA binding specificity 
(15, 16), RNAP contact (17, 18), and inducer binding (19, 20). A 
special class of signal molecule binding mutants in the case of 
lactose repressor can bind to DNA only in the presence of the 
signal, as opposed to the wild-type (WT) protein, which is inactivated
by signal molecule binding (19). Such point mutations allow
reversion of the regulatory logic of the lac operon, which would 
otherwise require extensive rearrangement of the lac regulatory 
region (21). 

In this work, we used the galactose regulon of Escherichia coli as 
a model system to study how single amino acid substitutions in 
regulatory proteins affect signal integration logic. The gal regulon 
consists of five operons (galETKM, galP, mglBAC, galR, and galS) 
which are controlled by the galactose repressor (GalR). Promoters 
of these operons are also controlled by cAMP-CRP, except PgalR,
which is not affected by cAMP-CRP in vitro. Only two operons are
transcribed in the absence of regulatory proteins in vitro (5, 22),
galR and galETKM. The galETKM operon contains genes required
for D-galactose (D-gal) metabolism. These genes are transcribed
from two promoters, P1galE and P2galE, which are repressed by the
Gal repressosome. Repressosome assembly requires (i) binding of
two dimeric GalR proteins to two spatially separated operator
elements, OE and OI, (ii) negatively supercoiled DNA, (iii) optimal
angular orientation of the two operator sites, and (iv) specific
binding of the architectural protein HU to a DNA site (hbs) in the
interoperator region (23, 24).

FIG 1 Schematic drawing of the experimental system used for testing signal integration at promoters. The two inputs of the system are cAMP and D-galactose,
which interact with CRP and GalR, respectively. GalR and cAMP-CRP can bind specific sequences in the regulatory regions of the promoters and influence
transcription of the reporter gene (uidA). The logic of signal integration, i.e., the activity of the promoter at different combinations of signals, depends on the
activity and on the combined action of the regulatory proteins. GalR is expressed constitutively from a multicopy plasmid.

Here, we analyze the effect of 44 single amino acid substitutions
on the performance of GalR. The majority of these substitutions
are neutral to repressosome-mediated transcription inhibition.
However, we find that even such "neutral" substitutions can
affect signal integration logic in a regulatory context-dependent 
manner. 

RESULTS AND DISCUSSION 

Construction of the experimental system. The experimental sys- 
tem utilized to study signal integration logic at different promot- 
ers (Fig. 1) is similar to the system used by Hunziker et al. (3) 
except that the chromosomal galR gene was deleted and GalR is
supplied using a multicopy plasmid. The cells used are unable to 
produce cAMP due to a deletion in the cyaA gene; therefore, in- 
tracellular cAMP and galactose levels can be controlled by the 
addition of these molecules to the growth medium. 

Structures of the three promoters at which integration of the 
two external signals, cAMP and D-galactose, was studied are 
shown in Fig. 2. 

Effect of single amino acid substitutions on repressosome- 
mediated repression. In previous studies, we mutagenized the 
galR coding region in plasmid pSEM1077 to obtain mutations that
result in single amino acid substitutions in the N-terminal third of 
GalR (15, 17). A collection of 44 substitutions was tested for 
repressosome-mediated repression of the Gate C promoter 
(Fig. 3). This promoter is active in the absence of GalR (Fig. 3,
GalR−). It can be repressed by DNA looping through repressosome
formation but not by binding of individual GalR dimers to
the operators. This is demonstrated by the activity of the promoter 
in the presence of GalR T322R (Fig. 3, T322R), a mutant that can 
bind operators like the wild type but is defective in tetramerization
and thus unable to form a repressosome (13, 15, 25).

Most of the substitutions allowed efficient repression of the 
reporter gene, indicated by the lack of blue color in the colonies in 
Fig. 3. Six amino acid substitutions allowed substantial reporter 
gene expression (T3I, D6G, V7A, A16T, S29D, and N48I). These 
are all located in the DNA binding headpiece of GalR (13) (Fig. 4). 
Because GalR is expressed constitutively from a multicopy plas- 
mid, the lack of repression most likely results from weaker DNA 
binding and not from decreased expression or stability of the mu- 
tant GalR proteins. Substitutions located close to or at the 
dimerization interface did not interfere with repression of the 
Gate C promoter, probably because the dimer is stabilized by a 
large number of interactions. 

Effect of single amino acid substitutions on signal integra- 
tion logic. Five substitutions which repressed the Gate C pro- 
moter efficiently were selected and studied in different regulatory 
contexts (Fig. 5). Two of these (K5E and V21A) are located in the 
DNA binding headpiece, while three are situated at the dimeriza- 
tion interface (D68E, D71E, and Q83P). Wild-type GalR and two 
previously characterized mutants were used as controls. These 
were Y244F, which is insensitive to D-galactose (20), and the te- 
tramerization mutant T322R. 

FIG 2 Structures of promoters and regulatory regions used in this study. The sequences shown were inserted between the EcoRI and PstI sites located upstream
of the uidA reporter gene and downstream of the rrnB T1T2 terminators (TrrnB). GalR and cAMP-CRP binding sites are marked blue and red, respectively. Sites
which are recognized by both GalR and cAMP-CRP are marked by both colors. Arrowheads indicate transcriptional start sites. Gate A contains part of the galP
promoter region and performs AND logic. Gate B performs D-gal NIMPLIES cAMP logic, while Gate C functions as a D-gal gate in the presence of wild-type GalR
expressed from a multicopy plasmid. Signal integration at these three promoters (Fig. 5) is represented schematically on the top right. Blue color indicates that
the reporter gene is expressed in a given combination of the two input signals, cAMP and D-galactose. White color indicates that the promoter of the reporter gene
is inactive.

Two different signal integration patterns were observed in the
case of Gate A, AND and FALSE (Fig. 5, left). This promoter is
based on the PgalP promoter, which was previously shown to perform
AND logic (5, 8). PgalP is inactive in the absence of cAMP-CRP.
Because GalR binding inhibits cAMP-CRP-mediated activation,
both signals are required for transcription. Similar signal
integration logic was observed in the presence of K5E, V21A, 
Q83P, and T322R as with WT GalR, indicating that these muta- 
tions allow normal DNA binding, D-galactose binding, and inhi- 
bition of cAMP-CRP. However, FALSE logic was observed in the 
presence of D68E and D71E, similar to Y244F. Repression of the 
promoter in the presence of both D-galactose and cAMP indicates 
that these mutants bind operators normally but DNA binding is 
not inhibited by D-galactose. Y244 was previously associated with 
one of the inducer binding segments (20). D68 and D71 are lo- 
cated relatively far from the predicted D-galactose binding cleft. 
Therefore, we speculate that substitutions at these positions inter- 
fere with the allosteric transition between the DNA binding and 
D-galactose binding states. 
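
The Gate A reasoning above can be sketched as a toy Boolean model. This is only an illustration under stated assumptions, not the authors' analysis: operator occupancy is treated as all-or-none, and the names gate_a, truth_table, and the inducible flag are hypothetical.

```python
from itertools import product


def gate_a(camp: bool, dgal: bool, inducible: bool = True) -> bool:
    """Toy Boolean model of Gate A (assumption: all-or-none binding)."""
    # GalR occupies its operator unless D-galactose releases it; a
    # noninducible mutant (Y244F, D68E, D71E in the text) stays bound.
    galr_bound = (not dgal) if inducible else True
    # The promoter needs cAMP-CRP activation, which bound GalR blocks.
    return camp and not galr_bound


def truth_table(inducible: bool) -> dict:
    """Map each (cAMP, D-gal) input pair to the modeled promoter output."""
    return {(camp, dgal): gate_a(camp, dgal, inducible)
            for camp, dgal in product((False, True), repeat=2)}


wt = truth_table(inducible=True)       # ON only with both signals: AND
mutant = truth_table(inducible=False)  # never ON: FALSE
print(wt)
print(mutant)
```

In this simplified picture, removing inducibility alone is enough to turn the AND pattern into FALSE, which matches the pattern reported for D68E, D71E, and Y244F.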

Three different signal integration patterns were observed in the
case of Gate B, which performs D-gal NIMPLIES cAMP (equivalent
to D-gal AND NOT cAMP) logic operation in the presence of
WT GalR (Fig. 5, middle column). This construct is based on the
galETKM regulatory region. The −10 element of the P2galE promoter
was replaced by the consensus sequence (TATAAT), and
the P1galE promoter was inactivated. The promoter is strongly repressed
by both GalR-mediated DNA looping and cAMP-CRP
binding. Therefore, transcription occurs only in the presence of 
D-galactose (inhibition of DNA looping) and absence of cAMP 
(cAMP-CRP binding is not allowed). K5E, V21A, and Q83P 
showed signal integration logic similar to that of WT GalR, al- 
though higher reporter expression was observed in the cases of 
K5E and V21A in the presence of D-galactose and absence of 
cAMP (Fig. 5, middle column, right bottom corners). TRUE logic 
was observed in the presence of T322R, i.e., the reporter showed 
strong expression at all combinations of input signals. This sub- 
stitution does not allow DNA loop formation but allows binding 
of individual dimers to the operators. In the Gate B construct, 
binding of a GalR dimer to the upstream operator activates the 
promoter, resulting in a high level of reporter expression. Such
activation can occur even in the presence of D-galactose (25), because
the stability of the GalR-operator complex is reduced only
7-fold by the presence of D-galactose in the complex (11). FALSE
logic was observed with Y244F, D68E, and D71E due to the uninducible
nature of these proteins.

GalR− WT T3I I4L K5E D6G V7A V13A A16T T17S V21A P26A S29D E30A A31G S32T
R33A L34A A35D V36A H37A S38G M40T E41A S42A Y45H H46L P47S N48I N50D R52H A53T
D68E D71E P72L V81A E82A Q83P Y86D H87Q T88K F91L L92S L93S Y244F T322R

FIG 3 Repression of the Gate C promoter by GalR mutants. Blue color results from successful conversion of the chromogenic substrate (X-gluc) by the UidA
protein, indicating that the reporter gene is expressed, i.e., not repressed efficiently by a given mutant.

FIG 4 Location of the tested amino acid substitutions in the GalR dimer. Positions of substitutions which allowed strong repression of the Gate C promoter are
colored yellow, while the ones which allowed substantial transcription of the promoter are marked blue. The red and green colors indicate positions of the Y244F
and T322R substitutions, respectively.

Signal integration at the Gate C promoter displayed the highest 
diversity (Fig. 5, right). This construct was made by introducing a 
set of mutations into the galETKM regulatory region (G-10A, 
A-7T, A2T, A5C, A7T, A24G, G54A, T55A, G60A, T62G, A63T, 
T66C, relative to the P2galE transcription start site). These mutations
inactivate the P1galE promoter and allow cAMP-CRP binding
to the downstream GalR operator site (Fig. 2). In the presence
of wild-type GalR provided from a multicopy plasmid, the promoter
is active only in the presence of D-galactose, regardless of
the presence of cAMP, i.e., it functions as a single input (D-gal)
logic gate. Repression of the promoter requires DNA looping, because
the P2galE promoter is activated by GalR when DNA loop
formation is not allowed (26). Therefore, as expected, TRUE logic
was observed in the case of the nonlooping mutant T322R, and 
FALSE logic was found in the case of the noninducible mutant 
Y244F (Fig. 5, right). Cells carrying K5E showed an expression 
pattern similar to that of wild-type GalR, but in the absence of 
D-galactose the reporter levels were higher than what was observed 
with WT GalR. However, OR logic was observed in the cases of 
V21A and Q83P, i.e., the promoter was active in the presence of 
any one of the signals and also when both signals were present. 
This observation suggests that cAMP-CRP can destabilize the 
GalR-mediated DNA loop by interfering with the binding of GalR 
to the downstream operator. Binding of these mutants to the up- 
stream operator seems to be unaffected by cAMP-CRP, because in 
the absence of GalR binding to the upstream operator cAMP-CRP 
would repress the P2galE promoter (26). The D68E- and D71E-mediated
DNA loops were also sensitive to the presence of cAMP-CRP;
however, in these cases, the signal integration logic resembled
a single input switch that responds only to cAMP and not to
D-gal. This result further confirms that these mutants are not inducible
by D-galactose and also suggests that a single amino acid
substitution can affect two distant ligand binding interfaces.

FIG 5 Effect of seven single amino acid substitutions on the signal integration
logic at three different promoters. Structures of the promoters are shown in
Fig. 2. The activity of the reporter gene (uidA) was monitored in the four
possible combinations of D-gal and cAMP, following the arrangement shown
in the top left corner. Blue color results from successful conversion of the
chromogenic substrate (X-gluc) by the UidA protein, indicating that the reporter
gene is expressed (ON) in a given combination of input signals. The lack
of blue color indicates that expression is OFF. In case of WT GalR, the lower
right corner (D-gal) represents the basal promoter activity (none of the regulators
bind to DNA), whereas the lower left corner (no signals) represents
promoter activity when GalR is bound. In the presence of cAMP (top left),
both GalR and cAMP-CRP can bind DNA, whereas in the presence of both
signals (top right), only cAMP-CRP can regulate the promoter.

Evolution of regulatory proteins and regulatory networks.
Bacteria can tolerate radical changes in their gene regulatory networks,
which can be easily rewired by gain or loss of cis-regulatory
elements and thus can evolve rapidly (1-3, 27-30). The strength of
specific protein-DNA interactions can be fine-tuned by mutations 
in the binding sites (31) to optimize the performance of the net- 
work. Analysis of evolutionary dynamics in prokaryotic networks 
revealed that transcription regulatory proteins and their target 
genes evolve relatively independently. Major phenotypical differ- 
ences between organisms are the result of changes in the regula- 
tory proteins rather than in the regulated gene repertoire. Also, the 
structure of the regulatory network reflects the lifestyle of the or- 
ganism better than its phylogenetic relations (32). These observa- 
tions are in line with a recent study which shows that accumula- 
tion of intermediary metabolites can cause cellular stress (33), 
suggesting that metabolic networks may have less plasticity than 
regulatory networks. 

Results presented in this work suggest that gene regulatory net- 
works can also be rewired by point mutations in the genes encod- 
ing transcription regulatory proteins. These mutations can change 
how metabolites are handled in certain conditions and can allow 
fast optimization of the network for a different lifestyle. 

Although regulatory proteins can become nonfunctional due 
to point mutations, certain regions of these proteins are tolerant 
to substitutions. For example, more than 44% of the amino acid
positions are tolerant to substitutions in the Lac repressor (LacI)
(19), which has a structure similar to GalR. The examples of the
V21A and Q83P substitutions in GalR suggest that changes in such
positions can alter the signal integration logic in certain regulatory 
contexts without perturbing the logic in other contexts. These 
mutants regulate all three promoters shown in Fig. 2 the same way 
as wild-type GalR in the absence of cAMP-CRP, i.e., the promot- 
ers are repressed in the absence of D-galactose and transcription is 
allowed in the presence of D-galactose (Fig. 5). However, in the 
presence of cAMP-CRP, the regulatory logic becomes qualita- 
tively different from the wild type in the case of Gate C and re- 
mains the same for Gate A and Gate B. That is, a small quantitative 
difference in the affinity of a regulator to a given operator site can 
result in a qualitative change in the overall function of a network. 
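
The context dependence described for V21A and Q83P can be checked directly from the logic names reported in the text. The sketch below is illustrative only: the dictionary layout and the helper differs_from_wt are hypothetical, and the tables encode the idealized Boolean patterns rather than measured expression levels.

```python
from itertools import product

# Named two-input logic functions of (cAMP, D-gal), as used in the text.
LOGIC = {
    "AND": lambda camp, dgal: camp and dgal,
    "NIMPLIES": lambda camp, dgal: dgal and not camp,  # D-gal AND NOT cAMP
    "D-gal": lambda camp, dgal: dgal,                  # single-input gate
    "OR": lambda camp, dgal: camp or dgal,
}

# Logic reported in the text for Gates A, B, and C.
OBSERVED = {
    "WT":   {"A": "AND", "B": "NIMPLIES", "C": "D-gal"},
    "V21A": {"A": "AND", "B": "NIMPLIES", "C": "OR"},
    "Q83P": {"A": "AND", "B": "NIMPLIES", "C": "OR"},
}


def differs_from_wt(mutant: str, gate: str, camp: bool) -> bool:
    """True if the mutant's output differs from WT at this cAMP level."""
    f_mut = LOGIC[OBSERVED[mutant][gate]]
    f_wt = LOGIC[OBSERVED["WT"][gate]]
    return any(f_mut(camp, d) != f_wt(camp, d) for d in (False, True))


changed = [(m, g, camp)
           for m, g, camp in product(("V21A", "Q83P"), "ABC", (False, True))
           if differs_from_wt(m, g, camp)]
print(changed)  # [('V21A', 'C', True), ('Q83P', 'C', True)]
```

Only Gate C in the presence of cAMP deviates from the wild-type pattern, which is the qualitative change in one regulatory context that the paragraph above describes.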

Concluding remarks. In summary, we can conclude that mu- 
tations in transcription regulatory proteins allow context- 
dependent engineering of signal integration logic at promoters, 
contributing to the plasticity of gene regulatory networks. This 
plasticity adds a layer of complexity to
bioinformatics-based network reconstruction, because sequences 
of transcription regulatory proteins are rarely identical in different 
organisms. 

MATERIALS AND METHODS 

Plasmid and strain construction. Synthetic regulatory regions were cre- 
ated by PCR and inserted upstream of the uidA open reading frame (ORF) 
on the chromosome of E. coli CH1200 cells (Δcya854) (13) by the method
described earlier (22). The galR gene in the obtained cells was replaced by
a chloramphenicol resistance gene (Cmr) according to the protocol described
by Datsenko and Wanner (34). The Cmr gene was PCR amplified
using the primers galRCmup (5' CCAACGGGCGTTTTCCGTAACACTGAAAGAATGTAAGCGTTTACCCACTAAGGTATTTTCATGCCGTTACGCACCACCCCGTC 3')
and galRCmdn (5' TCAGGCGCGGTTGATTCGCCGTCGCCAGACCATCGAAGAATTACTGGCGCTGGAATTACGCCCCGCCCTGCCACTC 3')
and pRFB122 as the template (35).
Sequences of the constructed regions and their flanking DNAs were veri- 
fied (Eurofins MWG Operon). 

Derivatives of plasmid pSEM1077 carrying mutations in the galR gene 
were obtained in a previous mutagenesis study (17). Plasmid pSEM1077 
was created from pSEM1029 (14) by engineering a PvuII site in the GalR
coding region (17). In these high-copy-number plasmids, the galR gene is
transcribed constitutively from the synthetic pEM7 promoter. 

Screening the logic of gene regulation. Signal integration in the con- 
structed circuits was characterized by monitoring gene expression at four 
different combinations of D-galactose and cAMP, representing the four 
extremes of the two-dimensional input functions. We prepared 4 LB plates
containing 100 μg/ml ampicillin, 30 μg/ml chloramphenicol, 50 μg/ml
X-gluc (5-bromo-4-chloro-3-indolyl-beta-D-glucuronic acid; Fermentas),
and (i) no D-galactose and no cAMP (no input signals), (ii) 8 mM
D-galactose (D-gal = 1), (iii) 0.16 mM cAMP (cAMP = 1), and (iv) 8 mM
D-gal and 0.16 mM cAMP (D-gal = 1 and cAMP = 1). We spotted 1 μl of
cell suspension on each LB agar plate.
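
The four plate conditions can be enumerated as the corners of the two-input space. This is a small bookkeeping sketch, assuming only the concentrations stated above; the function name plate_conditions and the dictionary keys are hypothetical.

```python
from itertools import product

DGAL_MM = 8.0    # D-galactose concentration when the D-gal input is 1
CAMP_MM = 0.16   # cAMP concentration when the cAMP input is 1


def plate_conditions():
    """Return the four plates as Boolean inputs plus concentrations in mM."""
    plates = []
    for camp, dgal in product((0, 1), repeat=2):
        plates.append({
            "cAMP": camp,
            "D-gal": dgal,
            "cAMP_mM": CAMP_MM * camp,
            "D-gal_mM": DGAL_MM * dgal,
        })
    return plates


for plate in plate_conditions():
    print(plate)
```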

This characterization resembles Boolean logic, where inputs and out- 
puts are 0 (absent) or 1 (present at high concentration). The output was 
evaluated based on the color of the colonies. The criterion of Boolean-type 
integration is that the high and low states of reporter gene expression can 
be clearly distinguished. The presence of blue color (1) reflects expression
of the reporter operon (uidABC), which is responsible for transport and 
metabolism of the chromogenic substrate X-gluc. In the absence of blue 
color, the output is 0. 

The logic gates were chosen to satisfy these criteria with wild-type 
GalR. The signal integration could be described by Boolean algebra in the 
presence of GalR mutants as well, except in the case of K5E and Gate C, 
where the high- and low-expression states could not be clearly distin- 
guished. 
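
The Boolean scoring described above amounts to matching a four-spot readout against named two-input gates. The following is a minimal sketch of that matching, not the authors' procedure: the gate list is limited to the patterns named in this article, and the names NAMED_GATES and classify, as well as the example readout, are hypothetical.

```python
from itertools import product

INPUTS = list(product((0, 1), repeat=2))  # (cAMP, D-gal) pairs

# Two-input gates named in the text, as functions of (cAMP, D-gal).
NAMED_GATES = {
    "FALSE": lambda c, d: 0,
    "TRUE": lambda c, d: 1,
    "AND": lambda c, d: c & d,
    "OR": lambda c, d: c | d,
    "D-gal NIMPLIES cAMP": lambda c, d: d & (1 - c),
    "D-gal only": lambda c, d: d,
    "cAMP only": lambda c, d: c,
}


def classify(spots: dict) -> str:
    """Name the gate matching a {(cAMP, D-gal): 0 or 1} colony readout."""
    for name, fn in NAMED_GATES.items():
        if all(spots[(c, d)] == fn(c, d) for c, d in INPUTS):
            return name
    return "unnamed"


# Hypothetical readout: blue only on the plate containing both signals.
readout = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
print(classify(readout))  # AND
```

A readout whose high and low states cannot be told apart, as noted for K5E at Gate C, has no valid 0/1 encoding and would need to be excluded before this kind of matching.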